An interactive 3D UMAP plot using plotly.


UMAP plots have become the standard for representing single-cell RNA-seq data. If you are still using tSNE instead of UMAP, you should stop. Right now. UMAP plots successfully represent not only the clusters in the high-dimensional space but also the relationships between the clusters, which is really important when analysing developmental datasets.

library(plotly)
library(dplyr)
library(knitr)

Our data: a single-cell RNA-seq dataset where observations are individual cells and the variables are each cell’s features, such as cell type (annotated by the authors), the age of the embryo the cell comes from, and the UMAP coordinates (which I generated for this example using a custom RNA-seq pipeline). The data also include the variable seurat_clusters, which was generated by clustering cells at the transcriptome level using Seurat.

kable(head(scatter.data))

|barcode                |CellType       |age |    UMAP_1|    UMAP_2|    UMAP_3|seurat_clusters |
|:----------------------|:--------------|:---|---------:|---------:|---------:|:---------------|
|E11.AAAGATGGTCCAGTTA-1 |RPCs           |E11 | 2.5108952| -3.998575|  4.401366|31              |
|E11.AACTCAGCATGACATC-1 |RPCs           |E11 | 4.0124172| -4.576830|  4.572913|5               |
|E11.AACTCTTAGAGAGCTC-1 |Lens Epithelia |E11 | 0.9536113| -5.190037| 11.240079|30              |
|E11.AAGCCGCTCATACGGT-1 |RPCs           |E11 | 5.4938731| -4.728292|  2.801343|20              |
|E11.AAGGCAGAGATCCCAT-1 |RPCs           |E11 | 4.3380307| -4.166779|  5.262853|5               |
|E11.ACACCAACATTTCAGG-1 |RPCs           |E11 | 3.9392980| -4.948364|  4.906728|5               |

The question we would like to answer with the interactive plot is whether the UMAP captures the biological relationships of retinal development. To find out, we can use an interactive UMAP plot annotated by cell type and age.

But first things first: colors! Set3 is a pretty decent qualitative palette that I prefer over the others in RColorBrewer because it’s pretty and, to my eyes, has the most distinct set of colors among the qualitative palettes.

newCols <- colorRampPalette(RColorBrewer::brewer.pal(12, 'Set3'))
# We want a different color for each cell type
mycolors <- newCols(scatter.data$CellType %>% unique() %>% length())
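To make the palette step easier to follow, here is a minimal sketch on a toy vector of labels. The vector `cell_types` and the label "Type C" are hypothetical stand-ins for `scatter.data$CellType`, and naming the colors with `setNames()` is my own addition rather than something the code above does.

```r
library(RColorBrewer)

# Hypothetical cell-type labels standing in for scatter.data$CellType;
# "Type C" is a made-up placeholder, not a category from the dataset.
cell_types <- c("RPCs", "Lens Epithelia", "RPCs", "Type C", "RPCs")

# One entry per distinct cell type, in a fixed (alphabetical) order
types <- sort(unique(cell_types))

# colorRampPalette() returns a function that interpolates between the
# 12 Set3 colors to produce however many colors are requested
pal <- colorRampPalette(brewer.pal(12, "Set3"))

# Name each color after its cell type so the mapping is explicit
type_colors <- setNames(pal(length(types)), types)
print(type_colors)
```

Because the colors are interpolated rather than picked one by one, asking for many more than 12 can give neighbouring colors that are hard to tell apart. As far as I know, plotly also accepts a named vector like `type_colors` for its `colors` argument, which pins each cell type to a specific color.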

And without further ado, our interactive UMAP using plotly:

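# Map the three UMAP dimensions to x, y and z, and color the points by cell type
fig <- plot_ly(scatter.data, x = ~UMAP_1, y = ~UMAP_2, z = ~UMAP_3,
    marker = list(size = 3), color = ~CellType,
    colors = mycolors,
    # Hover label built for each cell from several columns
    text = ~paste("CellType:", CellType, "Age:", age, "Cluster:", seurat_clusters),
    hoverinfo = 'text')

# Draw the cells as markers
fig %>% add_markers()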

As you can see, plotly takes a color argument, which can be mapped to a variable in the data frame. We also provide one color per cell type through colors. The text argument is very interesting because we can include variable names in a paste() call, which creates a rich label for each cell.
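To see what the hover labels look like before they reach plotly, here is a small base R sketch. The data frame `toy` holds three rows copied from the head(scatter.data) output above. The multi-line variant with `<br>` is my own suggestion, on the assumption that plotly renders `<br>` as a line break in hover text.

```r
# Three rows copied from the head(scatter.data) output above
toy <- data.frame(
  CellType = c("RPCs", "RPCs", "Lens Epithelia"),
  age = c("E11", "E11", "E11"),
  seurat_clusters = c(31, 5, 30)
)

# paste() is vectorised: one label per row, pieces joined by spaces
label_one_line <- with(toy, paste("CellType:", CellType, "Age:", age,
                                  "Cluster:", seurat_clusters))
print(label_one_line[1])
# [1] "CellType: RPCs Age: E11 Cluster: 31"

# paste0() joins without a separator; "<br>" is meant as a line break
# in the hover box (an assumption about how plotly renders the text)
label_multi_line <- with(toy, paste0("CellType: ", CellType,
                                     "<br>Age: ", age,
                                     "<br>Cluster: ", seurat_clusters))
print(label_multi_line[3])
```

Either vector has one element per cell, which is why the same expression can be written as a formula (`text = ~paste(...)`) inside plot_ly.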